Overview

Dataset statistics

Number of variables: 25
Number of observations: 300000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 31.2 MiB
Average record size in memory: 109.0 B
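
The average record size is the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, assuming MiB means 2**20 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics.
total_mib = 31.2
n_rows = 300_000

BYTES_PER_MIB = 1024 ** 2  # assumption: 1 MiB = 2**20 bytes

avg_record_bytes = total_mib * BYTES_PER_MIB / n_rows
print(f"{avg_record_bytes:.2f} B per record")
```

The result lands within a fraction of a byte of the reported 109.0 B; the small gap is expected because 31.2 MiB is itself a rounded figure.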

Variable types

Numeric: 11
Categorical: 14

Alerts

zip is highly correlated with lat and 3 other fields [High correlation]
lat is highly correlated with zip and 3 other fields [High correlation]
long is highly correlated with zip and 3 other fields [High correlation]
merch_lat is highly correlated with zip and 3 other fields [High correlation]
merch_long is highly correlated with zip and 3 other fields [High correlation]
hour is highly correlated with category_gas_transport and 1 other field [High correlation]
category_gas_transport is highly correlated with hour [High correlation]
category_grocery_pos is highly correlated with hour [High correlation]
amt is highly skewed (γ1 = 56.19402201) [Skewed]
hour has 9746 (3.2%) zeros [Zeros]
day has 59974 (20.0%) zeros [Zeros]
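
The percentages in the two Zeros alerts are the zero counts divided by the number of observations. A small sketch that reproduces them (the helper name `zero_share` is hypothetical). For hour and day a zero is presumably a legitimate value, such as midnight or the first day index, rather than a data-quality problem.

```python
n_rows = 300_000
zero_counts = {"hour": 9746, "day": 59974}  # counts from the alerts


def zero_share(count: int, total: int) -> str:
    """Format the share of zeros as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


shares = {name: zero_share(c, n_rows) for name, c in zero_counts.items()}
print(shares)  # {'hour': '3.2%', 'day': '20.0%'}
```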

Reproduction

Analysis started: 2022-11-09 00:00:06.786425
Analysis finished: 2022-11-09 00:02:22.719329
Duration: 2 minutes and 15.93 seconds
Software version: pandas-profiling v3.3.0
Download configuration: config.json
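
The reported duration is the difference between the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

started = datetime.fromisoformat("2022-11-09 00:00:06.786425")
finished = datetime.fromisoformat("2022-11-09 00:02:22.719329")

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
# prints "2 minutes and 15.93 seconds"
```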

Variables

amt
Real number (ℝ≥0)

SKEWED

Distinct: 30525
Distinct (%): 10.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.12587797
Minimum: 1
Maximum: 28948.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2.44
Q1: 9.66
Median: 47.44
Q3: 83.14
95-th percentile: 196.311
Maximum: 28948.9
Range: 28947.9
Interquartile range (IQR): 73.48
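
Range and IQR are derived directly from the quantiles listed above: range is maximum minus minimum, and IQR is Q3 minus Q1. A minimal sketch using the amt figures (the dictionary layout is only for illustration):

```python
# Quantile statistics reported for amt.
quantiles = {"min": 1.0, "q1": 9.66, "median": 47.44, "q3": 83.14, "max": 28948.9}

value_range = quantiles["max"] - quantiles["min"]
iqr = quantiles["q3"] - quantiles["q1"]

print(round(value_range, 2), round(iqr, 2))  # 28947.9 73.48
```

The range is roughly 400 times the IQR, which is consistent with the Skewed alert: a handful of very large transactions stretch the upper tail.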

Descriptive statistics

Standard deviation: 162.4457824
Coefficient of variation (CV): 2.316488393
Kurtosis: 7709.998671
Mean: 70.12587797
Median Absolute Deviation (MAD): 37.45
Skewness: 56.19402201
Sum: 21037763.39
Variance: 26388.63222
Monotonicity: Not monotonic
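
Several of these figures are linked by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the number of observations. A quick consistency check on the amt values:

```python
import math

n = 300_000
mean = 70.12587797
std = 162.4457824

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance from the standard deviation
total = mean * n       # sum of all values from the mean

print(f"CV={cv:.6f}  variance={variance:.2f}  sum={total:.2f}")

# Each derived value agrees with the reported one up to rounding.
assert math.isclose(cv, 2.316488393, rel_tol=1e-6)
assert math.isclose(variance, 26388.63222, rel_tol=1e-6)
assert math.isclose(total, 21037763.39, rel_tol=1e-8)
```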
Histogram with fixed size bins (bins=50)
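
Fixed-size binning splits the interval between the minimum and maximum into equal-width bins. The sketch below is a plain-Python illustration of that idea, not the plotting code the report actually ran; the function name and the toy sample are assumptions. With a distribution as skewed as amt, nearly every value falls into the first bin.

```python
def fixed_width_histogram(values, bins):
    """Count values into `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin instead of overflowing.
        idx = min(int((v - lo) / width), bins - 1) if width else 0
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# Toy sample built from the amt quantiles, for illustration only.
sample = [1.0, 2.44, 9.66, 47.44, 83.14, 196.31, 28948.9]
counts, edges = fixed_width_histogram(sample, bins=50)
print(counts[0], counts[-1])  # 6 1
```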
Value                   Count     Frequency (%)
1.14                    142       < 0.1%
1.12                    134       < 0.1%
1.2                     133       < 0.1%
1.02                    133       < 0.1%
1.22                    130       < 0.1%
1.7                     128       < 0.1%
1.09                    127       < 0.1%
1.16                    127       < 0.1%
1.1                     125       < 0.1%
3.71                    124       < 0.1%
Other values (30515)    298697    99.6%

Value       Count    Frequency (%)
1           49       < 0.1%
1.01        112      < 0.1%
1.02        133      < 0.1%
1.03        116      < 0.1%
1.04        116      < 0.1%
1.05        117      < 0.1%
1.06        97       < 0.1%
1.07        109      < 0.1%
1.08        115      < 0.1%
1.09        127      < 0.1%

Value       Count    Frequency (%)
28948.9     1        < 0.1%
27119.77    1        < 0.1%
22768.11    1        < 0.1%
13149.15    1        < 0.1%
11422.83    1        < 0.1%
11371.95    1        < 0.1%
9197.47     1        < 0.1%
8517.38     1        < 0.1%
7886.26     1        < 0.1%
7288.5      1        < 0.1%

zip
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 963
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48893.8212
Minimum: 1257
Maximum: 99921
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 1257
5-th percentile: 7439
Q1: 26292
Median: 48193
Q3: 72042
95-th percentile: 94569
Maximum: 99921
Range: 98664
Interquartile range (IQR): 45750
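
Percentiles such as Q1 or the 95-th percentile are order statistics of the sorted column. The sketch below assumes linear interpolation between neighbouring values, which is the default in pandas and NumPy; whether the report used that default is an assumption. The function name and the five-value sample are illustrative only.

```python
def percentile(sorted_values, q):
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = q * (len(sorted_values) - 1)   # fractional index into the data
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


# Toy sample: five zip values, not the full column.
zips = sorted([1257, 26292, 48193, 72042, 99921])
print(percentile(zips, 0.25), percentile(zips, 0.5), percentile(zips, 0.75))
```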

Descriptive statistics

Standard deviation: 26849.73684
Coefficient of variation (CV): 0.5491437604
Kurtosis: -1.094036271
Mean: 48893.8212
Median Absolute Deviation (MAD): 23039
Skewness: 0.07646239989
Sum: 1.466814636 × 10¹⁰
Variance: 720908368.2
Monotonicity: Not monotonic
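
Skewness and kurtosis summarise the shape of a distribution. The negative kurtosis reported here suggests the figure is excess kurtosis (zero for a normal distribution). The sketch below assumes the bias-adjusted sample estimators that pandas uses for `Series.skew` and `Series.kurt`; that the report relies on exactly these is an assumption, and the function names are hypothetical.

```python
import math


def sample_skewness(xs):
    """Adjusted Fisher-Pearson skewness (needs at least 3 values)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return math.sqrt(n * (n - 1)) / (n - 2) * m3 / m2 ** 1.5


def sample_excess_kurtosis(xs):
    """Bias-adjusted excess kurtosis (needs at least 4 values)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    g2 = m4 / m2 ** 2 - 3
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 10]
print(sample_skewness(symmetric), sample_excess_kurtosis(symmetric))
print(sample_skewness(right_tailed) > 0)
```

A symmetric, flat sample gives zero skewness and negative excess kurtosis, which is the same pattern as zip (skewness near 0, kurtosis about -1.09).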
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
82514                 839       0.3%
73754                 829       0.3%
48088                 817       0.3%
34112                 815       0.3%
58569                 771       0.3%
21872                 759       0.3%
72042                 753       0.3%
49628                 750       0.2%
49895                 739       0.2%
38761                 736       0.2%
Other values (953)    292192    97.4%

Value    Count    Frequency (%)
1257     455      0.2%
1330     251      0.1%
1535     120      < 0.1%
1545     221      0.1%
1612     113      < 0.1%
1843     629      0.2%
1844     453      0.2%
2180     133      < 0.1%
2630     452      0.2%
2908     143      < 0.1%

Value    Count    Frequency (%)
99921    1        < 0.1%
99783    364      0.1%
99747    3        < 0.1%
99746    118      < 0.1%
99323    622      0.2%
99160    714      0.2%
99116    2        < 0.1%
99113    260      0.1%
99033    583      0.2%
98836    126      < 0.1%

lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 975
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.53826613
Minimum: 20.0271
Maximum: 66.6933
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 20.0271
5-th percentile: 29.8826
Q1: 34.6205
Median: 39.3543
Q3: 41.8948
95-th percentile: 45.8433
Maximum: 66.6933
Range: 46.6662
Interquartile range (IQR): 7.2743

Descriptive statistics

Standard deviation: 5.076917293
Coefficient of variation (CV): 0.1317370448
Kurtosis: 0.7906627346
Mean: 38.53826613
Median Absolute Deviation (MAD): 3.3677
Skewness: -0.1848757267
Sum: 11561479.84
Variance: 25.7750892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
43.0048               839       0.3%
36.385                829       0.3%
42.5164               817       0.3%
26.1184               815       0.3%
46.1838               771       0.3%
38.4121               759       0.3%
34.2853               753       0.3%
44.5995               750       0.2%
46.3535               739       0.2%
33.4783               736       0.2%
Other values (965)    292192    97.4%

Value      Count    Frequency (%)
20.0271    370      0.1%
20.0827    208      0.1%
24.6557    590      0.2%
26.1184    815      0.3%
26.3304    120      < 0.1%
26.3771    135      < 0.1%
26.4215    693      0.2%
26.4722    602      0.2%
26.529     332      0.1%
26.6939    261      0.1%

Value      Count    Frequency (%)
66.6933    3        < 0.1%
65.6899    118      < 0.1%
64.7556    364      0.1%
55.4732    1        < 0.1%
48.8878    714      0.2%
48.8856    493      0.2%
48.8328    367      0.1%
48.6669    235      0.1%
48.6031    732      0.2%
48.4786    471      0.2%

long
Real number (ℝ)

HIGH CORRELATION

Distinct: 976
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -90.2547715
Minimum: -165.6723
Maximum: -67.9503
Zeros: 0
Zeros (%): 0.0%
Negative: 300000
Negative (%): 100.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: -165.6723
5-th percentile: -119.0825
Q1: -96.798
Median: -87.4769
Q3: -80.1752
95-th percentile: -73.5365
Maximum: -67.9503
Range: 97.722
Interquartile range (IQR): 16.6228

Descriptive statistics

Standard deviation: 13.73774947
Coefficient of variation (CV): -0.1522107833
Kurtosis: 1.838306623
Mean: -90.2547715
Median Absolute Deviation (MAD): 8.1276
Skewness: -1.145172422
Sum: -27076431.45
Variance: 188.7257604
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
-108.8964             839       0.3%
-98.0727              829       0.3%
-82.9832              817       0.3%
-81.7361              815       0.3%
-101.2589             771       0.3%
-75.2811              759       0.3%
-82.7243              758       0.3%
-91.3336              753       0.3%
-86.2141              750       0.2%
-86.6345              739       0.2%
Other values (966)    292170    97.4%

Value        Count    Frequency (%)
-165.6723    364      0.1%
-156.292     118      < 0.1%
-155.488     208      0.1%
-155.3697    370      0.1%
-153.994     3        < 0.1%
-133.1171    1        < 0.1%
-124.4409    222      0.1%
-124.2174    372      0.1%
-124.1587    228      0.1%
-124.1437    336      0.1%

Value       Count    Frequency (%)
-67.9503    474      0.2%
-68.5565    235      0.1%
-69.2675    107      < 0.1%
-69.4828    501      0.2%
-69.9576    122      < 0.1%
-69.9656    647      0.2%
-70.1031    4        < 0.1%
-70.239     194      0.1%
-70.2394    5        < 0.1%
-70.3001    452      0.2%

city_pop
Real number (ℝ≥0)

Distinct: 873
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 88514.15292
Minimum: 23
Maximum: 2906700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 23
5-th percentile: 139
Q1: 741
Median: 2435
Q3: 20328
95-th percentile: 518429
Maximum: 2906700
Range: 2906677
Interquartile range (IQR): 19587

Descriptive statistics

Standard deviation: 302042.5067
Coefficient of variation (CV): 3.412363975
Kurtosis: 37.96721111
Mean: 88514.15292
Median Absolute Deviation (MAD): 2180
Skewness: 5.622829603
Sum: 2.655424588 × 10¹⁰
Variance: 9.122967585 × 10¹⁰
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
606                   1291      0.4%
1312922               1201      0.4%
1595797               1165      0.4%
1766                  1026      0.3%
241                   1008      0.3%
2906700               985       0.3%
1126                  966       0.3%
302                   956       0.3%
198                   953       0.3%
276002                935       0.3%
Other values (863)    289514    96.5%

Value    Count    Frequency (%)
23       472      0.2%
37       240      0.1%
43       470      0.2%
46       685      0.2%
47       114      < 0.1%
49       232      0.1%
51       242      0.1%
52       123      < 0.1%
53       644      0.2%
60       219      0.1%

Value      Count    Frequency (%)
2906700    985      0.3%
2504700    466      0.2%
2383912    118      < 0.1%
1595797    1165     0.4%
1577385    587      0.2%
1526206    811      0.3%
1382480    472      0.2%
1312922    1201     0.4%
1263321    847      0.3%
1241364    587      0.2%

merch_lat
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 297367
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.53994798
Minimum: 19.027849
Maximum: 67.397018
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 19.027849
5-th percentile: 29.74189105
Q1: 34.7378955
Median: 39.368308
Q3: 41.95946525
95-th percentile: 46.02228925
Maximum: 67.397018
Range: 48.369169
Interquartile range (IQR): 7.22156975

Descriptive statistics

Standard deviation: 5.111003209
Coefficient of variation (CV): 0.1326157267
Kurtosis: 0.7749178973
Mean: 38.53994798
Median Absolute Deviation (MAD): 3.402611
Skewness: -0.1825545615
Sum: 11561984.4
Variance: 26.12235381
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
39.24133                 3         < 0.1%
38.556877                3         < 0.1%
38.986749                3         < 0.1%
38.716326                3         < 0.1%
40.137418                3         < 0.1%
41.930416                3         < 0.1%
38.933501                3         < 0.1%
43.327495                3         < 0.1%
43.479957                3         < 0.1%
42.890771                3         < 0.1%
Other values (297357)    299970    > 99.9%

Value        Count    Frequency (%)
19.027849    1        < 0.1%
19.0419      1        < 0.1%
19.044747    1        < 0.1%
19.045277    1        < 0.1%
19.048124    1        < 0.1%
19.060875    1        < 0.1%
19.062888    1        < 0.1%
19.072289    1        < 0.1%
19.07844     1        < 0.1%
19.079607    1        < 0.1%

Value        Count    Frequency (%)
67.397018    1        < 0.1%
67.188111    1        < 0.1%
66.682905    1        < 0.1%
66.679297    1        < 0.1%
66.67154     1        < 0.1%
66.664673    1        < 0.1%
66.624674    1        < 0.1%
66.609525    1        < 0.1%
66.599806    1        < 0.1%
66.595782    1        < 0.1%

merch_long
Real number (ℝ)

HIGH CORRELATION

Distinct: 298924
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -90.25348029
Minimum: -166.670685
Maximum: -66.950902
Zeros: 0
Zeros (%): 0.0%
Negative: 300000
Negative (%): 100.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: -166.670685
5-th percentile: -119.2618549
Q1: -96.908114
Median: -87.4729205
Q3: -80.27720875
95-th percentile: -73.3890317
Maximum: -66.950902
Range: 99.719783
Interquartile range (IQR): 16.63090525

Descriptive statistics

Standard deviation: 13.75125306
Coefficient of variation (CV): -0.1523625794
Kurtosis: 1.830671472
Mean: -90.25348029
Median Absolute Deviation (MAD): 8.223297
Skewness: -1.141771337
Sum: -27076044.09
Variance: 189.0969607
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
-83.06618                3         < 0.1%
-82.159848               3         < 0.1%
-95.672843               2         < 0.1%
-123.159512              2         < 0.1%
-74.700957               2         < 0.1%
-86.618196               2         < 0.1%
-75.174779               2         < 0.1%
-94.441692               2         < 0.1%
-84.631577               2         < 0.1%
-80.745822               2         < 0.1%
Other values (298914)    299978    > 99.9%

Value          Count    Frequency (%)
-166.670685    1        < 0.1%
-166.655425    1        < 0.1%
-166.654993    1        < 0.1%
-166.649771    1        < 0.1%
-166.648577    1        < 0.1%
-166.639673    1        < 0.1%
-166.633523    1        < 0.1%
-166.632072    1        < 0.1%
-166.629875    1        < 0.1%
-166.627418    1        < 0.1%

Value         Count    Frequency (%)
-66.950902    1        < 0.1%
-66.955996    1        < 0.1%
-66.960745    1        < 0.1%
-66.963918    1        < 0.1%
-66.96411     1        < 0.1%
-66.971614    1        < 0.1%
-66.977475    1        < 0.1%
-66.985361    1        < 0.1%
-66.986039    1        < 0.1%
-66.989254    1        < 0.1%

age
Real number (ℝ≥0)

Distinct: 82
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.69269667
Minimum: 17
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 17
5-th percentile: 25
Q1: 35
Median: 47
Q3: 60
95-th percentile: 83
Maximum: 98
Range: 81
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 17.39218856
Coefficient of variation (CV): 0.3571826938
Kurtosis: -0.1757988633
Mean: 48.69269667
Median Absolute Deviation (MAD): 12
Skewness: 0.6096193579
Sum: 14607809
Variance: 302.488223
Monotonicity: Not monotonic
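
The Median Absolute Deviation is a robust measure of spread: the median of the absolute distances from the median. A minimal sketch, assuming the unscaled definition (no normal-consistency factor); that the report uses this exact variant is an assumption, and the five ages are a toy sample rather than the real column.

```python
from statistics import median


def median_absolute_deviation(xs):
    """Median of the absolute deviations from the median (unscaled)."""
    xs = list(xs)
    center = median(xs)
    return median(abs(x - center) for x in xs)


ages = [25, 35, 47, 60, 83]  # illustrative values only
print(median_absolute_deviation(ages))  # 13
```

Unlike the standard deviation, this statistic barely moves when a few extreme values are added, which is why it is worth reading alongside the standard deviation for skewed columns.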
Histogram with fixed size bins (bins=50)
Value                Count     Frequency (%)
50                   10472     3.5%
38                   9177      3.1%
37                   8766      2.9%
35                   8636      2.9%
48                   7934      2.6%
46                   7503      2.5%
49                   7251      2.4%
36                   7234      2.4%
32                   7155      2.4%
44                   7152      2.4%
Other values (72)    218720    72.9%

Value    Count    Frequency (%)
17       500      0.2%
18       1844     0.6%
19       965      0.3%
20       2        < 0.1%
21       1344     0.4%
22       2262     0.8%
23       4483     1.5%
24       2964     1.0%
25       6924     2.3%
26       1430     0.5%

Value    Count    Frequency (%)
98       136      < 0.1%
97       2        < 0.1%
96       1440     0.5%
95       1071     0.4%
94       920      0.3%
93       1375     0.5%
92       829      0.3%
91       1080     0.4%
90       483      0.2%
89       702      0.2%

hour
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.81252667
Minimum: 0
Maximum: 23
Zeros: 9746
Zeros (%): 3.2%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 7
Median: 14
Q3: 19
95-th percentile: 23
Maximum: 23
Range: 23
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 6.818080035
Coefficient of variation (CV): 0.5321417245
Kurtosis: -1.079681598
Mean: 12.81252667
Median Absolute Deviation (MAD): 5
Skewness: -0.2843813962
Sum: 3843758
Variance: 46.48621537
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Value                Count     Frequency (%)
23                   15637     5.2%
22                   15318     5.1%
18                   15280     5.1%
21                   15272     5.1%
15                   15258     5.1%
19                   15240     5.1%
16                   15233     5.1%
17                   15140     5.0%
20                   15084     5.0%
13                   15046     5.0%
Other values (14)    147492    49.2%

Value    Count    Frequency (%)
0        9746     3.2%
1        9966     3.3%
2        9859     3.3%
3        9935     3.3%
4        9736     3.2%
5        9635     3.2%
6        9763     3.3%
7        9777     3.3%
8        9927     3.3%
9        9802     3.3%

Value    Count    Frequency (%)
23       15637    5.2%
22       15318    5.1%
21       15272    5.1%
20       15084    5.0%
19       15240    5.1%
18       15280    5.1%
17       15140    5.0%
16       15233    5.1%
15       15258    5.1%
14       15007    5.0%

day
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.986706667
Minimum: 0
Maximum: 6
Zeros: 59974
Zeros (%): 20.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 5
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.202142771
Coefficient of variation (CV): 0.7373147139
Kurtosis: -1.461086656
Mean: 2.986706667
Median Absolute Deviation (MAD): 2
Skewness: -0.008326242623
Sum: 896012
Variance: 4.849432785
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value    Count    Frequency (%)
0        59974    20.0%
6        56329    18.8%
5        43540    14.5%
1        42411    14.1%
4        34518    11.5%
3        33399    11.1%
2        29829    9.9%

Value    Count    Frequency (%)
0        59974    20.0%
1        42411    14.1%
2        29829    9.9%
3        33399    11.1%
4        34518    11.5%
5        43540    14.5%
6        56329    18.8%

Value    Count    Frequency (%)
6        56329    18.8%
5        43540    14.5%
4        34518    11.5%
3        33399    11.1%
2        29829    9.9%
1        42411    14.1%
0        59974    20.0%
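
Because day has only seven distinct values, its frequency table fully determines the summary statistics. The sketch below rebuilds the percentages, the sum, and the mean from the counts alone (variable names are illustrative):

```python
# Counts per day value, copied from the frequency table.
day_counts = {0: 59974, 1: 42411, 2: 29829, 3: 33399, 4: 34518, 5: 43540, 6: 56329}

total = sum(day_counts.values())
shares = {day: round(100 * c / total, 1) for day, c in day_counts.items()}
weighted_sum = sum(day * c for day, c in day_counts.items())
mean_day = weighted_sum / total

print(total)         # 300000
print(shares)
print(weighted_sum)  # 896012
print(round(mean_day, 6))
```

The total matches the number of observations, and the weighted sum and mean agree with the Sum and Mean reported for day.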

month
Real number (ℝ≥0)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.136313333
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.541097825
Coefficient of variation (CV): 0.4962082885
Kurtosis: -1.250295477
Mean: 7.136313333
Median Absolute Deviation (MAD): 3
Skewness: -0.1178102538
Sum: 2140894
Variance: 12.53937381
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value               Count    Frequency (%)
12                  48634    16.2%
6                   26631    8.9%
5                   25600    8.5%
3                   25036    8.3%
11                  24612    8.2%
9                   24493    8.2%
10                  23656    7.9%
4                   23626    7.9%
8                   22580    7.5%
7                   19918    6.6%
Other values (2)    35214    11.7%

Value    Count    Frequency (%)
1        18335    6.1%
2        16879    5.6%
3        25036    8.3%
4        23626    7.9%
5        25600    8.5%
6        26631    8.9%
7        19918    6.6%
8        22580    7.5%
9        24493    8.2%
10       23656    7.9%

Value    Count    Frequency (%)
12       48634    16.2%
11       24612    8.2%
10       23656    7.9%
9        24493    8.2%
8        22580    7.5%
7        19918    6.6%
6        26631    8.9%
5        25600    8.5%
4        23626    7.9%
3        25036    8.3%

is_fraud
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 298444
1: 1556

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        298444    99.5%
1        1556      0.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        298444    99.5%
1        1556      0.5%

Most occurring characters

Value    Count     Frequency (%)
0        298444    99.5%
1        1556      0.5%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        298444    99.5%
1        1556      0.5%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        298444    99.5%
1        1556      0.5%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        298444    99.5%
1        1556      0.5%
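
The is_fraud counts show a strongly imbalanced target. A small sketch that quantifies the imbalance from the two counts above; the variable names are illustrative, and the baseline is simply the accuracy of always predicting the majority class.

```python
# Class counts for is_fraud, copied from the frequency table.
counts = {0: 298444, 1: 1556}

total = sum(counts.values())
fraud_rate = counts[1] / total            # share of fraudulent rows
imbalance_ratio = counts[0] / counts[1]   # legitimate rows per fraudulent row
baseline_accuracy = counts[0] / total     # accuracy of always predicting 0

print(f"fraud rate: {fraud_rate:.2%}")
print(f"imbalance ratio: {imbalance_ratio:.1f} to 1")
print(f"majority-class baseline accuracy: {baseline_accuracy:.1%}")
```

With a baseline this high, plain accuracy says little about a fraud model; metrics that focus on the minority class are usually more informative.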

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 278845
1: 21155

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        278845    92.9%
1        21155     7.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        278845    92.9%
1        21155     7.1%

Most occurring characters

Value    Count     Frequency (%)
0        278845    92.9%
1        21155     7.1%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        278845    92.9%
1        21155     7.1%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        278845    92.9%
1        21155     7.1%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        278845    92.9%
1        21155     7.1%

category_gas_transport
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 269561
1: 30439

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        269561    89.9%
1        30439     10.1%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        269561    89.9%
1        30439     10.1%

Most occurring characters

Value    Count     Frequency (%)
0        269561    89.9%
1        30439     10.1%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        269561    89.9%
1        30439     10.1%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        269561    89.9%
1        30439     10.1%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        269561    89.9%
1        30439     10.1%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 289508
1: 10492

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        289508    96.5%
1        10492     3.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        289508    96.5%
1        10492     3.5%

Most occurring characters

Value    Count     Frequency (%)
0        289508    96.5%
1        10492     3.5%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        289508    96.5%
1        10492     3.5%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        289508    96.5%
1        10492     3.5%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        289508    96.5%
1        10492     3.5%

category_grocery_pos
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 271478
1: 28522

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        271478    90.5%
1        28522     9.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        271478    90.5%
1        28522     9.5%

Most occurring characters

Value    Count     Frequency (%)
0        271478    90.5%
1        28522     9.5%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        271478    90.5%
1        28522     9.5%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        271478    90.5%
1        28522     9.5%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        271478    90.5%
1        28522     9.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 280277
1: 19723

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        280277    93.4%
1        19723     6.6%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        280277    93.4%
1        19723     6.6%

Most occurring characters

Value    Count     Frequency (%)
0        280277    93.4%
1        19723     6.6%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        280277    93.4%
1        19723     6.6%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        280277    93.4%
1        19723     6.6%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        280277    93.4%
1        19723     6.6%

category_home
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 271535
1: 28465

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 1

Common Values

Value    Count     Frequency (%)
0        271535    90.5%
1        28465     9.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        271535    90.5%
1        28465     9.5%

Most occurring characters

Value    Count     Frequency (%)
0        271535    90.5%
1        28465     9.5%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        271535    90.5%
1        28465     9.5%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        271535    90.5%
1        28465     9.5%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        271535    90.5%
1        28465     9.5%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 273671
1: 26329

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        273671    91.2%
1        26329     8.8%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        273671    91.2%
1        26329     8.8%

Most occurring characters

Value    Count     Frequency (%)
0        273671    91.2%
1        26329     8.8%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        273671    91.2%
1        26329     8.8%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        273671    91.2%
1        26329     8.8%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        273671    91.2%
1        26329     8.8%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 285256
1: 14744

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        285256    95.1%
1        14744     4.9%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        285256    95.1%
1        14744     4.9%

Most occurring characters

Value    Count     Frequency (%)
0        285256    95.1%
1        14744     4.9%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        285256    95.1%
1        14744     4.9%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        285256    95.1%
1        14744     4.9%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        285256    95.1%
1        14744     4.9%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 281500
1: 18500

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        281500    93.8%
1        18500     6.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        281500    93.8%
1        18500     6.2%

Most occurring characters

Value    Count     Frequency (%)
0        281500    93.8%
1        18500     6.2%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        281500    93.8%
1        18500     6.2%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        281500    93.8%
1        18500     6.2%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        281500    93.8%
1        18500     6.2%

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 MiB
0: 278944
1: 21056

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 300000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count     Frequency (%)
0        278944    93.0%
1        21056     7.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value    Count     Frequency (%)
0        278944    93.0%
1        21056     7.0%

Most occurring characters

Value    Count     Frequency (%)
0        278944    93.0%
1        21056     7.0%

Most occurring categories

Value             Count     Frequency (%)
Decimal Number    300000    100.0%

Most frequent character per category

Decimal Number
Value    Count     Frequency (%)
0        278944    93.0%
1        21056     7.0%

Most occurring scripts

Value     Count     Frequency (%)
Common    300000    100.0%

Most frequent character per script

Common
Value    Count     Frequency (%)
0        278944    93.0%
1        21056     7.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    300000    100.0%

Most frequent character per block

ASCII
Value    Count     Frequency (%)
0        278944    93.0%
1        21056     7.0%

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.3 MiB

Value  Count
0      277642
1      22358

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 300000
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Common Values

Value  Count   Frequency (%)
0      277642  92.5%
1      22358   7.5%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count   Frequency (%)
0      277642  92.5%
1      22358   7.5%

Most occurring characters

Value  Count   Frequency (%)
0      277642  92.5%
1      22358   7.5%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  300000  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      277642  92.5%
1      22358   7.5%

Most occurring scripts

Value   Count   Frequency (%)
Common  300000  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      277642  92.5%
1      22358   7.5%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  300000  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      277642  92.5%
1      22358   7.5%

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.3 MiB

Value  Count
0      272957
1      27043

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 300000
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 0
3rd row 1
4th row 0
5th row 0

Common Values

Value  Count   Frequency (%)
0      272957  91.0%
1      27043   9.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count   Frequency (%)
0      272957  91.0%
1      27043   9.0%

Most occurring characters

Value  Count   Frequency (%)
0      272957  91.0%
1      27043   9.0%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  300000  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      272957  91.0%
1      27043   9.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  300000  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      272957  91.0%
1      27043   9.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  300000  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      272957  91.0%
1      27043   9.0%

category_travel
Categorical

Distinct 2
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 2.3 MiB

Value  Count
0      290536
1      9464

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 300000
Distinct characters 2
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
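
As a rough illustration of the character properties mentioned above, the sketch below looks up the Unicode general category of every character in a small 0/1 column using Python's standard `unicodedata` module. The helper name `category_counts` and the sample values are made up for this example; the module reports categories by their short codes (for instance `Nd` for Decimal Number) and does not expose script or block names, so those are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters of the given values."""
    counts = Counter()
    for value in values:
        for ch in str(value):
            # unicodedata.category returns a two-letter code such as "Nd".
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical sample resembling a one-hot column stored as text.
sample = ["0", "0", "1", "0", "0"]
counts = category_counts(sample)
print(counts)
```

For a column that only contains the characters 0 and 1, every character falls into the single category `Nd`, which is consistent with the "Distinct categories 1" figure reported for these variables.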

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Common Values

Value  Count   Frequency (%)
0      290536  96.8%
1      9464    3.2%

Length

Histogram of lengths of the category

Category Frequency Plot

Value  Count   Frequency (%)
0      290536  96.8%
1      9464    3.2%

Most occurring characters

Value  Count   Frequency (%)
0      290536  96.8%
1      9464    3.2%

Most occurring categories

Value           Count   Frequency (%)
Decimal Number  300000  100.0%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      290536  96.8%
1      9464    3.2%

Most occurring scripts

Value   Count   Frequency (%)
Common  300000  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
0      290536  96.8%
1      9464    3.2%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  300000  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
0      290536  96.8%
1      9464    3.2%

Interactions

[Interaction plots omitted: 121 panels, which would correspond to one plot for each ordered pair of the 11 numeric variables.]

Correlations

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
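
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the code used to produce this report: the helper names (`average_ranks`, `pearson`, `spearman`) and the sample data are assumptions made for the example, and ties are handled by giving tied values the average of their ranks.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # The common 1/(n-1) factors cancel, so they are left out.
    return cov / (dev_x * dev_y)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(spearman(x, y), pearson(x, y))
```

On this toy data ρ is essentially 1 because the ranks agree perfectly, while Pearson's r on the raw values is lower because the relationship is not linear.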

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
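
A minimal sketch of this calculation, assuming sample (n - 1) estimates for both the covariance and the standard deviations; the function name `pearson_r` and the data are illustrative only. The second call rescales and shifts one variable to show the invariance under changes in location and scale.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


# Illustrative data with a nearly linear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)
# Rescaling x by a positive factor and shifting it leaves r unchanged.
r_transformed = pearson_r([10 * a + 7 for a in x], y)
print(r, r_transformed)
```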

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
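
The pair-counting definition above can be written directly as a short function. This sketch implements the simple form without a tie correction (often called τ-a), which may differ from the variant used to produce this report when the data contain ties; the function name and sample data are assumptions for the example.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Illustrative data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
print(kendall_tau_a(x, y))  # 0.4
```

Counting every pair takes quadratic time, so for a dataset of this size a library routine would be the more practical choice.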

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
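
The following is a sketch of a bias-corrected Cramér's V using the correction as it is commonly stated for Bergsma's estimator; it is not necessarily identical to the code that generated this report. The function name `cramers_v_corrected` and the toy inputs are assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_count in row_totals.items():
        for col_label, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi squared and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Toy inputs: identical labels versus a balanced, unrelated pairing.
same = [0, 1] * 50
print(cramers_v_corrected(same, same))
print(cramers_v_corrected([0, 0, 1, 1] * 25, [0, 1, 0, 1] * 25))
```

Identical label sequences give a value of essentially 1, and the balanced unrelated pairing gives 0, matching the endpoints of the range described above.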

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

amt zip lat long city_pop merch_lat merch_long age hour day month is_fraud category_food_dining category_gas_transport category_grocery_net category_grocery_pos category_health_fitness category_home category_kids_pets category_misc_net category_misc_pos category_personal_care category_shopping_net category_shopping_pos category_travel
0167.429352937.7773-119.082563338.492626-118.67723595821200001000000000
146.913976933.3570-89.0473192333.193352-90.01705862115400100000000000
28.461050441.1360-73.7009798741.493080-74.290518581041100000000000010
3112.716366537.3272-91.024324136.342555-91.407343481841000000010000000
456.416468639.7417-93.628927139.435902-93.06493150226100000010000000
543.077741229.6047-96.524910629.645347-96.489121392161000000001000000
616.316267340.0994-89.960153040.703103-90.35985755144100000000001000
7229.846207539.3036-89.2853345839.297215-88.85933437133300000000010000
8124.205775643.3526-102.5411112642.944973-101.57796642111900000000000010
9169.732510638.8265-82.136464239.612086-82.36522976106600000000100000

Last rows

amt zip lat long city_pop merch_lat merch_long age hour day month is_fraud category_food_dining category_gas_transport category_grocery_net category_grocery_pos category_health_fitness category_home category_kids_pets category_misc_net category_misc_pos category_personal_care category_shopping_net category_shopping_pos category_travel
2999903.304803442.4969-83.29117583043.099145-83.93494048126600000000100000
29999169.769781345.8289-118.4971130246.392243-118.52567446131300000000000000
29999256.67801439.8016-75.347850440.177050-75.24982942641100010000000000
29999311.663892233.9215-89.6782345134.154141-89.23840938230100000010000000
29999445.878754336.1486-105.664824735.918408-105.01124461501200100000000000
29999535.94835039.4850-74.877682538.762023-74.95772531116900000000010000
29999688.586721637.6223-97.313640965638.444081-97.4887549246600001000000000
29999716.637683431.8287-99.4270590831.973577-98.9345456016300000000010000
29999840.917463336.6966-96.786947136.753618-95.91474381451100000000100000
29999978.067509233.6372-96.61844656333.756980-96.0787555246300100000000000